Adaptive Critic-based Neural Network Object Contact Controller for a Three-finger Gripper

Authors

  • Gustavo Galan
  • S. Jagannathan
Abstract

A MARS greenhouse needs on-board arms capable of manipulating objects such as plant trays, fruits, and vegetables. Grasping and manipulating objects has long been a challenging task for robots. It is important that the manipulator performs these tasks accurately and quickly without damaging the object. The complex grasping task can be divided into object contact control and manipulation subtasks. In this paper, the object contact task is defined in terms of following a trajectory accurately so that the object to be grasped is in contact with the gripper. The proposed scheme consists of a feedforward action generating NN that compensates for the nonlinear gripper dynamics. The learning of this NN is performed on-line based on a critic signal so that a 3-finger gripper tracks a predefined desired trajectory, which is specified in terms of a desired position and velocity for object contact control. Novel weight tuning updates are derived for the action generating NN and a Lyapunov-based stability analysis is presented. Simulation results are shown for a 3-finger gripper making contact with an object.

1. Introduction

Human fingers can lift an object by applying an adequate grasping force even when the weight and the friction coefficient of the object are unknown. Further, humans use their fingers to feel the texture of the object through tactile sensing so that the required force can be applied to lift the object without any slippage. Several methods for lifting objects using grippers have been proposed in the literature [4]. All of these methods involve the gripper making contact with the object along a predefined path so that the fingers touch the object to be grasped at the right location and orientation. Designing a gripper to grasp an object is a complex and expensive task. Most designs rely on feeding the robot with a variety of possible object patterns.
Position, weight, orientation, and shape are the main object characteristics specified to a gripper so that it can follow a predefined trajectory to reach the object and grasp it with the proper force, while the integrity of the object is guaranteed throughout the manipulation. This grasping task requires a sophisticated controller.

1 This research is supported in part by the Texas Space Grant Consortium Award #26-4315-01.

One of the problems in grasping an object is the ability of the gripper to reach an object in different positions and locations [5,9]. Every time the object's location or orientation changes, a new trajectory for the gripper to follow is computed and executed. Given the trajectory, the robot's gripper controller has to track every newly planned path accurately [5,9]. Several techniques have been developed for robots to identify and grasp objects. Visual recognition, object avoidance, moving targets, switching control, grasp quality measures, force feedback, and force control are concepts involved in the grasping of objects [4,6-8]. Such techniques have been able to determine the proper trajectory the gripper has to follow in order to reach and grasp a given object. However, the controllers presented in [4,6-8] are either heuristic or focus on object recognition rather than on controlling the contact. This paper presents the design of a gripper controller so that the trajectory defined by the planner for object contact and grasping can be accurately tracked to reach the object at the right location and orientation.

Neural networks (NN) have been shown to be very effective for the control of nonlinear dynamical systems. In reinforcement learning, or the adaptive critic-based NN method, learning is performed based on a performance measure from a critic instead of the gradient information supplied in other NN methods such as backpropagation.
In other words, the signal provided by the critic in adaptive critic schemes conveys much less information than the desired output required in supervised learning. Nevertheless, their ability to generate correct control actions makes adaptive critics important candidates where the lack of sufficient structure in the task definition makes it difficult to define a priori the desired outputs for each input, as required by supervised learning control [1].

In this paper, a novel adaptive critic neural network-based gripper controller is developed for object contact control. The action generating NN approximates the dynamics of the gripper so that the object can be approached and contacted without being damaged. A critic signal is supplied to the action generating NN controller to tune the NN weights so that the action generating NN approximates the gripper dynamics. Closed-loop performance is guaranteed through the proposed learning algorithms. In Section 2, a brief background on neural networks and the stability of nonlinear systems is presented. The dynamic model of the three-finger gripper used in our work is given in Section 3 along with the novel adaptive critic algorithm. Simulation results are included in Section 4 to illustrate the validity of the approach. Section 5 presents the conclusions of this work.

2. Neural Network Background

A general function $f(x) \in C(s)$ can be approximated using a neural network with at least two layers of appropriate weights, given by

$$f(x) = W^T \sigma(V^T x) + \xi \tag{2.1}$$

where $W$ are the constant weights, $\sigma(V^T x)$ is the vector of activation functions, and $\xi$ is the approximation error. If $V$ is selected as the identity matrix and the activation functions are selected as basis functions, then a one-layer NN results. Define the net output as

$$y = W^T \phi(x) \tag{2.2}$$

For a one-layer NN, for suitable approximation properties, $\phi(x)$ must be a basis.
For instance, it is well known in the NN literature that radial basis functions form a basis [3]. In this paper, we show how to select basis functions using the physics of the gripper instead of selecting them arbitrarily. Further, the tedium of analytically deriving the regression matrix for each gripper, as required in conventional adaptive control, is avoided.

2.1 Stability of Systems

To formulate the controller, the following stability notion is needed. Consider the nonlinear system given by

$$\dot x = f(x, u), \qquad y = h(x) \tag{2.3}$$

where $x(t)$ is the state vector, $u(t)$ is the input vector and $y(t)$ is the output vector [3]. The solution is said to be uniformly ultimately bounded (UUB) if for all $x(t_0) = x_0$ there exist an $\eta \ge 0$ and a number $N(\eta, x_0)$ such that $\|x(t)\| \le \eta$ for all $t \ge t_0 + N$.

3. Modeling of a Three-Finger Gripper

The dynamics of a finger in a three-finger gripper, shown in Fig. 1, obtained from [2], are expressed as

$$M\ddot x + d_2 \dot x = \tau + F_e \tag{3.1}$$

where $M$ is the mass matrix for all the moving parts, $d_2$ is the viscous friction factor, $\tau$ is the control input, and $F_e$ is the Coulomb friction force of the actuator gear system. The dynamics can be rewritten as

$$\ddot x = f(h) + M^{-1}\tau \tag{3.2}$$

with $f(h)$ being the nonlinear function of the gripper dynamics given by $M^{-1}(F_e - d_2 \dot x)$. Our objective is to design a control input $\tau$ that guarantees a desired gripper motion. Given a smooth trajectory, and if $M$ is accurately known, the control input can be selected as

$$\tau = M^{-1}\bigl(\hat f(h) + k_v r - v\bigr) \tag{3.3}$$

where $\hat f(h)$ is an approximation of the nonlinear function $f(h)$ with $h = [x, \dot x, x_d, \dot x_d, e, \dot e, r]$, $k_v$ is the gain matrix, $v(t)$ is an auxiliary input, $x_d$ and $x$ are the desired and actual trajectories, and $e$ and $\dot e$ are the tracking errors in position and velocity, defined as $e = x_d - x$ and $\dot e = \dot x_d - \dot x$.
Applying (3.3) in (3.2) with zero auxiliary input, the tracking error system can be shown to be asymptotically stable. However, the viscous friction and the Coulomb friction forces are neither known nor measurable, and hence novel controller schemes are required.

Figure 1: Schematic of a 3-finger robot gripper.

Define the filtered tracking error as

$$r = \dot e + \Lambda e \tag{3.4}$$

where $\Lambda$ is an appropriately dimensioned matrix selected through pole placement. This selection must ensure that when the filtered tracking error converges to zero, the trajectory error $e(t)$ converges to zero. A common choice is a diagonal $\Lambda$ with large positive entries. Then (3.4) is a stable system, so $e(t)$ is bounded as long as the controller guarantees that the filtered tracking error $r(t)$ is bounded. Differentiating (3.4) and using (3.1), the error dynamics can be expressed in terms of the filtered tracking error as

$$M\dot r = -k_v r + \tilde f(h) + v(t) + \varepsilon + \tau_d \tag{3.5}$$

where $\tilde f(\cdot)$ is the error in approximation, defined as $\tilde f = f(h) - \hat f(h)$, and $\tau_d$ denotes a bounded disturbance. Note that the closed-loop filtered tracking error system is driven by the functional approximation error. In the ideal case, when there is no approximation error and no disturbance is present in (3.1), the tracking error converges exponentially to zero, demonstrating the tracking performance. On the other hand, if the functional approximation error is non-zero, then the stability of the closed-loop tracking error system needs to be guaranteed. In this paper, we assume that the dynamics of the gripper are unknown, and a one-layer neural network is employed to approximate the dynamics given by the nonlinear function $f(h)$. Further, by appropriately choosing the neural network weight updates, the stability of the closed-loop system is established in the subsequent sections.
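The role of the filtered tracking error in (3.4) can be illustrated numerically. The sketch below is only an illustration and is not the authors' simulation: it assumes a scalar finger coordinate, an illustrative gain of 5 for $\Lambda$, and a hypothetical helper `simulate_error` that integrates $\dot e = -\Lambda e + r$ with forward Euler. With $r = 0$ the error should decay like $e^{-\Lambda t}$, and with $|r| \le r_{max}$ the error should settle inside $r_{max}/\Lambda$.

```python
import math


def simulate_error(e0, lam, r_of_t, t_end=2.0, dt=1e-4):
    """Integrate e_dot = -lam * e + r(t), i.e. (3.4) solved for e_dot."""
    e, t = e0, 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        e += dt * (-lam * e + r_of_t(t))   # forward Euler step
        t += dt
    return e


LAMBDA = 5.0   # illustrative gain, not a value from the paper
E0 = 0.1       # illustrative initial position error

# Case 1: r = 0, so the exact solution is e(t) = E0 * exp(-LAMBDA * t).
e_zero_r = simulate_error(E0, LAMBDA, lambda t: 0.0)

# Case 2: bounded r(t); e(t) stays ultimately within R_MAX / LAMBDA.
R_MAX = 0.02
e_bounded_r = simulate_error(
    E0, LAMBDA, lambda t: R_MAX * math.sin(3.0 * t), t_end=10.0
)
```

A larger $\Lambda$ shrinks the residual error for the same bound on $r$, which is one reading of why large positive diagonal entries are the common choice.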
3.1 Approximation-based Critic Neural Network Controller

In the presence of bounded disturbances, the dynamics of the gripper are expressed as

$$M\ddot x + d_2 \dot x + \tau_d = \tau + F_e \tag{3.6}$$

where $\tau_d$ is a bounded disturbance added to the system given in (3.1), whose norm is bounded by a known constant $d_B$, with $\|\tau_d\| \le d_B$. Differentiating (3.4), one obtains

$$M\dot r = M\ddot x_d - M\ddot x + M\Lambda \dot e. \tag{3.7}$$

Substituting (3.1), (3.7) can be rewritten as

$$M\dot r = M\ddot x_d + d_2 \dot x - F_e - \tau + M\Lambda \dot e \tag{3.8}$$

where

$$\dot x = \dot x_d + \Lambda e - r. \tag{3.9}$$

Using (3.9) in (3.8), the filtered tracking error dynamics become

$$M\dot r = M\ddot x_d + d_2 \dot x_d + d_2 \Lambda e - d_2 r - F_e - \tau + M\Lambda \dot e. \tag{3.10}$$

Defining $f(h)$ as the unknown nonlinear dynamics for one finger,

$$f(h) = M\ddot x_d + M\Lambda \dot e + d_2 \dot x_d + d_2 \Lambda e - F_e - d_2 r \tag{3.11}$$

yields the filtered tracking error system, with the disturbance of (3.6) included, as

$$M\dot r = f(h) - \tau + \tau_d. \tag{3.12}$$

Let the control input for the finger be selected as

$$\tau = \hat f(h) + k_v r - v \tag{3.13}$$

Then the closed-loop system dynamics are expressed as

$$M\dot r = f(h) - \hat f(h) - k_v r + v(t) + \tau_d \tag{3.14}$$

or

$$M\dot r = -k_v r + \tilde f + v(t) + \tau_d \tag{3.15}$$

where $\tilde f(\cdot)$ is the error in approximation of the nonlinear function $f(h)$ defined in (3.11). Here a one-layer action generating NN is used to approximate the unknown dynamics of the finger. The tunable NN weights enter in a linear fashion. Along with the tuning algorithm, a secondary adaptive function (known as a critic signal) is developed through Lyapunov stability analysis [1]. Assume, therefore, that there exists a constant ideal set of weights $W$ for a one-layer NN so that the nonlinear function can be written as

$$f(h) = W^T \phi(h) + \xi \tag{3.16}$$

where $\phi(h)$ provides a suitable basis and $\|\xi\| \le \xi_N$ with the bound known. For suitable approximation properties, it is necessary to select a large enough number of hidden-layer neurons. An NN input vector can be chosen based on the function $f(h)$ it is trying to build [3]. One such basis vector is given by $h = [1, x, \dot x, e, \dot e, x_d, \dot x_d]^T$.
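The control law (3.13) and the linear-in-weights estimate suggested by (3.16) can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the authors' implementation: it treats a single scalar finger coordinate, takes the basis to be the seven-entry vector quoted above, holds the weights fixed, and uses hypothetical helper names (`basis`, `nn_estimate`, `control_input`) and illustrative numbers that do not come from the paper. It also checks numerically that (3.12) and (3.15) give the same filtered-error rate.

```python
def basis(x, x_dot, x_d, x_d_dot, lam):
    """Return (phi, r): the basis vector and the filtered error of (3.4)."""
    e = x_d - x                  # position tracking error
    e_dot = x_d_dot - x_dot      # velocity tracking error
    r = e_dot + lam * e          # filtered tracking error
    phi = [1.0, x, x_dot, e, e_dot, x_d, x_d_dot]
    return phi, r


def nn_estimate(w_hat, phi):
    """One-layer NN output f_hat = w_hat^T phi (weights enter linearly)."""
    return sum(w * p for w, p in zip(w_hat, phi))


def control_input(w_hat, phi, r, k_v, v=0.0):
    """Control law (3.13): tau = f_hat + k_v * r - v."""
    return nn_estimate(w_hat, phi) + k_v * r - v


# Illustrative numbers only; none of these values come from the paper.
M, K_V, LAM = 0.5, 20.0, 5.0
phi, r = basis(x=0.0, x_dot=0.0, x_d=0.01, x_d_dot=0.0, lam=LAM)
w_hat = [0.0] * len(phi)         # untrained weights, so f_hat = 0
tau = control_input(w_hat, phi, r, K_V)

f_true = 0.3                     # hypothetical value of f(h) at this instant
tau_d = 0.01                     # hypothetical disturbance sample

# Filtered-error rate from (3.12) and from (3.15) with v = 0; they should agree.
r_dot_312 = (f_true - tau + tau_d) / M
f_tilde = f_true - nn_estimate(w_hat, phi)
r_dot_315 = (-K_V * r + f_tilde + tau_d) / M
```

The weights are deliberately left untrained here, since the tuning law is derived later in the paper and is not part of this sketch; with zero weights the controller reduces to the proportional term $k_v r$.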
3.2 Adaptive Critic Controller Structure

A choice of the critic signal is given by

$$R = P\sigma(r) + \rho \tag{3.17a}$$

where $P$ is a diagonal positive definite matrix, $\sigma(r)$ is the sigmoid term and $\rho$ is an auxiliary critic signal which is defined later. The action generating NN functional estimate is defined by

$$\hat f(h) = \hat W^T \phi(h) \tag{3.17b}$$

with $\hat W$ being the current value of the weights. The next step is to determine the weight updates so that the performance of the closed-loop error dynamics of the gripper is guaranteed. Let $W$ be the unknown ideal weights required for the approximation to hold in (2.1) and assume they are bounded by known values so that

Publication date: 2001